Modeling Annotators: A Generative Approach to Learning from Annotator Rationales

Authors

Omar Zaidan

Jason Eisner

Abstract

A human annotator can provide hints to a machine learner by highlighting contextual “rationales” for each of his or her annotations (Zaidan et al., 2007). How can one exploit this side information to better learn the desired parameters θ? We present a generative model of how a given annotator, knowing the true θ, stochastically chooses rationales. Thus, observing the rationales helps us infer the true θ. We collect substring rationales for a sentiment classification task (Pang and Lee, 2004) and use them to obtain significant accuracy improvements for each annotator. Our new generative approach exploits the rationales more effectively than our previous “masking SVM” approach. It is also more principled, and could be adapted to help learn other kinds of probabilistic classifiers for quite different tasks.

Similar resources

Machine Learning with Annotator Rationales to Reduce Annotation Cost

We review two novel methods for text categorization, based on a new framework that utilizes richer annotations that we call annotator rationales. A human annotator provides hints to a machine learner by highlighting contextual “rationales” in support of each of his or her annotations. We have collected such rationales, in the form of substrings, for an existing document sentiment classification...

Modeling annotator expertise: Learning when everybody knows a bit of something

Supervised learning from multiple labeling sources is an increasingly important problem in machine learning and data mining. This paper develops a probabilistic approach to this problem when annotators may be unreliable (labels are noisy), but also their expertise varies depending on the data they observe (annotators may have knowledge about different parts of the input space). That is, an anno...

Automatically Generating Annotator Rationales to Improve Sentiment Classification

One of the central challenges in sentiment-based text categorization is that not every portion of a document is equally informative for inferring the overall sentiment of the document. Previous research has shown that enriching the sentiment labels with human annotators’ “rationales” can produce substantial improvements in categorization performance (Zaidan et al., 2007). We explore methods to a...

Active Learning from Multiple Knowledge Sources

Some supervised learning tasks do not fit the usual single annotator scenario. In these problems, ground-truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little at...

Learning from Multiple Annotators with Gaussian Processes

In many supervised learning tasks it can be costly or infeasible to obtain objective, reliable labels. We may, however, be able to obtain a large number of subjective, possibly noisy, labels from multiple annotators. Typically, annotators have different levels of expertise (i.e., novice, expert) and there is considerable disagreement among annotators. We present a Gaussian process (GP) approach ...

Publication date: 2008
